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For paving the way to novel apphcations in quantum simulation, computation, and technology, 
increasingly large quantum systems have to be steered with high precision. It is a typical task 
amenable to numerical optimal control to turn the time course of pulses, i.e. piecewise constant 
control amplitudes, iteratively into an optimised shape. Here, we present the first comparative 
study of optimal control algorithms for a wide range of finite-dimensional applications. We focus on 
the most commonly used algorithms: grape methods which update all controls concurrently, and 
Krotov-type methods which do so sequentially. Guidelines for their use are given and open research 
questions are pointed out. — Moreover we introduce a novel unifying algorithmic framework, 
DYNAMO [dynamic optimisation platform) designed to provide the quantum-technology community 
with a convenient MATLAB-based toolset for optimal control. In addition, it gives researchers in 
optimal-control techniques a framework for benchmarking and comparing new proposed algorithms 
to the state-of-the-art. It allows for a mix-and-match approach with various types of gradients, 
update and step-size methods as well as subspace choices. Open-source code including examples is 



made available at http://qlib.info 



PACS numbers: 03.67.Lx; 02.60.Pn; 02.30.Yy, 07.05. Dz 



I. INTRODUCTION 

For unlocking the inherent quantum treasures of future 
quantum technology, it is essential to steer experimental 
quantum dynamical systems in a fast, accurate, and ro- 
bust way [l|, 01 . While the accuracy demands in quantum 
computation (the 'error-correction threshold') may seem 
daunting at the moment, quantum simulation is far less 
sensitive. 

In practice, using coherent superpositions as a re- 
source is often tantamount to protecting quantum sys- 
tems against relaxation without compromising accuracy. 
In order to tackle these challenging quantum engineering 
tasks, optimal control algorithms are establishing them- 
selves as indispensable tools. They have matured from 
principles Q and early implementations 0^ via spec- 
troscopic applications 043 to advanced numerical algo- 
rithms [lol . for state-to-state transfer and quantum- 
gate synthesis [l^ alike. 

In engineering high-end quantum experiments, 
progress has been made in many areas including cold 
atoms in optical lattice potentials [l3j_14|, trapped ions 



[l5l - [2l| , and superconducting qubits 22 , 23| to name just 
a few. To back these advances, optimal control among 
numerical tools have become increasingly important, 
see, e.g., [l^l for a recent review. For instance, near 
time-optimal control may take pioneering realisations 
of solid-state qubits being promising candidates for 
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a computation platform l25ll. from their fidelity-limit 
to the decoherence-limit [2g. More recently, open 
systems governe d by a Markovian master equation have 
been addressed [27|, and even smaller non-Markovian 
subsystems can be tackled, if they can be embedded into 
a larger system that in turn interacts in a Markovian 
way with its environment |28|. Taking the concept of 
decoherence-free subspaces |29l. [soj to more realistic 
scenarios, avoiding decoherence in encoded subspaces 
(3l| complements recent approaches of dynamic error 
correction [s^. [33|. — Along these lines, quantum control 
is anticipated to contribute significantly to bridging the 
gap between quantum principles demonstrated in pio- 
neering experiments and high-end quantum engineering 



Scope and Focus 

The schemes by which to locate the optimal control se- 
quence within the space of possible sequences are varied. 
The values taken by the system controls over time may be 
parameterised by piece- wise constant control amplitudes 
in the time domain, or in frequency space (3^ . by splines 
or other methods. For specific aspects of the toolbox of 
quantum control, see e.g. [Ill [ll, [H, [H, [M [Ml, [H- 
kSfl . while a recent review can be found in [46|. Here, 
we concentrate on piece-wise constant controls in the 
time domain. For this parametrisation of the control 
space, there are two well-established optimal control ap- 
proaches: Krotov-type methods jH, [s^, 113, IH| which 
update all controls within a single time slice once be- 
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fore proceeding on to the next time slice (cycling back to 
the first slice when done), and GRAPE- type methods ^f| 
which update all controls in all time slices concurrently. 
Here we refer to the former as sequential-update schemes 
and to the latter as concurrent-update schemes. 

Sequential methods have mainly been applied to pro- 
vide control fields in (infinite-dimensional) systems of 
atomic and molecular optics characterised by energy po- 
tentials m, [13, m, [13 J while concurrent methods have 
mostly been applied to (finite-dimensional) qubit systems 
of spin natur e flll .[l^. or to Josephson elements [26, 2^, 
ion traps 15 ll. 1521 , or 2D-cavity grids in quantum electro- 
dynamics |53| . Here we compare sequential vs. concur- 
rent algorithms in finite-dimensional systems. 

Both of the methods require a mechanism to control 
the selection of the next point to sample. For sequential- 
update methods, which perform a single or few iterations 
per parameter subspace choice, first-order methods are 
most often used; yet for algorithms repeatedly modifying 
the same wide segment of parameter space at every itera- 
tion, second-order methods, such as the well-established 
one by Broyden-Fletcher-Goldfarb-Shanno (bfgs) [H], 
seem better-suited. These choices, however, are by no 
means the final word and are subject of on-going research. 

Controlling quantum systems via algorithms on classi- 
cal computers naturally comes with unfavourable scaling. 
Thus it is essential to optimise the code by minimising the 
number of operations on matrices which scale with the 
system size, and by parallelising computation on high- 
performance clusters. While elements of the latter have 
been accomplished [l^, here we focus on the former. 

To this end, we present a new unifying programming 
framework, the DYNAMO platform, allowing to combine 
different methods of subspace selection, gradient calcula- 
tion, update controls, step-size controls, etc. The frame- 
work allows for benchmarking the various methods on a 
wide range of problems in common usage, allowing fu- 
ture research to quickly compare proposed methods to 
the current state-of-the-art. It also makes significant 
strides towards minimising the number of matrix op- 
erations required for serial, concurrent, and generalised 
hybrid schemes. Full MATLAB code of the platform is 
provided to the community alongside this manuscript 
at http://qlib.info, — We benchmark Krotov-type al- 
gorithms and GRAPE algorithms over a selection of sce- 
narios, giving the user of control techniques guidelines as 
to which algorithm is appropriate for which problem. 

The paper is organised as follows: In Sec. II we provide 
a generalised algorithmic framework embracing the es- 
tablished algorithms GRAPE and Krotov as limiting cases. 
Sec. HI shows how the formal treatment applies to con- 
crete standard settings of optimising state transfer and 
gate synthesis in closed and open quantum systems. In 
Sec. IV we compare the computational performance of 
concurrent vs. sequential update algorithms for a number 
of typical test problems of synthesising gates or cluster 
states. Computational performance is discussed in terms 
of costly multiplications and exponentials of matrices. 



Sec. V provides the reader with an outlook on emerg- 
ing guidelines as to which type of problem asks for which 
flavour of algorithm in order not to waste computation 
time. — Finally, we point at a list of open research ques- 
tions, in the persuit of which dynamo is anticipated to 
prove useful. 

II. ALGORITHMIC SETTINGS 

Most of the quantum control problems boil down to 
a single general form, namely steering a dynamic sys- 
tem following an internal drift under additional external 
controls, such as to maximise a given figure of merit. Be- 
cause the underlying equation of motion is taken to be 
linear both in the drift as well as in the control terms, dy- 
namic systems of this form are known as bilinear control 
systems (E) 

m 

X{t)^-{A + Y,u,{t)B,) X{t) (I) 
j=i 

with 'state' X{t) G C^, drift A € Matjv(C), controls 
Bj € Mat TV (C), and control amplitudes Uj{t) € M. 
Defining the Au{t) := A + X]j=i generators, 
the formal solution reads 

t 

X(t) = Tcxp{-y"drA„(r)}X(0) , (2) 



where T denotes Dyson's time ordering operator. — In 
this work, the pattern of a bilinear control system will 
turn out to serve as a convenient unifying frame for ap- 
plications in closed and open quantum systems, which 
thus can be looked upon as a variation of a theme. 

A. Closed Quantum Systems 

Throughout this work we study systems that are 
fully controllable (56l46^ . i.e. those in which — neglecting 
relaxation — every unitary gate can be realised. Finally, 
unless specified otherwise, we allow for unbounded con- 
trol amplitudes. 

Closed quantum systems are defined by the system 
Hamiltonian Hd as the only drift term, while the 'switch- 
able' control Hamiltonians Hj express external manipu- 
lations in terms of the quantum system itself, where each 
control Hamiltonian can be steered in time by its (here 
piece- wise constant) control amplitudes Uj{t). Thus one 
obtains a bilinear control system in terms of the con- 
trolled Schrodinger equations 

m 

m)) = -i{Hd + Y.u,{t)H,)m)) (3) 

m 

IJ{t) ^ -t{Hd + Y.u,{t)H,)U{t) , (4) 
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where the second identity can be envisaged as hfting the 
first one to an operator equation. For brevity we hence- 
forth concatenate all Hamiltonian components and write 

H^{t):=H, + Y,Mt)Hj ■ (5) 

Usually one wishes to absorb unobservable global phases 
by taking density-operator representations of states p{t). 
Their time evolution is brought about by unitary conju- 
gation U{-) ■.= U{-)W = Adu{-) generated by commuta- 
tion with the Hamiltonian Hu{-) ;= [Hu,{-)] = ad//^(-). 
So in the projective representation in Liouville space, 
Eqns. © and (H]) take the form 

m = -tH.ap{t) (6) 

f^Uit) = ~iHuU{t) . (7) 
It is now easy to accommodate dissipation to this setting. 

B. Open Quantum Systems 

Markovian relaxation can readily be introduced on the 
level of the equation of motion by the operator F, which 
may. e.g., take the GKS-Lindblad form. Then the respec- 
tive controlled master equations for state transfer and its 
lift for gate synthesis read 

p{t) = -{iHu + F) p{t) (8) 
F{t) - -{iH,, + F) F{t) . (9) 

Here F denotes a quantum map in GL(N2) as linear image 
over all basis states of the Liouville space representing the 
open system, where henceforth := 2" for an n-qubit 
system. Note that only in the case of [Hu , F ] = the 
map F{t) boils down to a mere contraction of the unitary 
conjugation U{t). In the generic case, it is the intricate 
interplay of the respective coherent (iHu) and incoherent 
(F) part of the time evolution [U that ultimately entails 
the need for relaxation-optimised control based on the 
full knowledge of the master Eqn. 



Table I: Bilinear Quantum Control Systems 



Setting and Task 
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More precisely, observe there are two scenarios for re- 
alising quantum gates or modules UiT) £ SU(N) with 
maximum trace fidelities: Let 

g:^jjtr{uL^,,UiT)} (II) 

define the normalised overlap of the generated gate UiT) 
with the target. Then the quality function 

fsv := i Rctr{uLs,,U{T)} = Reg (12) 

covers the case where overall global phases shall be re- 
spected, whereas if a global phase is immaterial , an- 
other quality function fpsv applies, whose square reads 

/|su:=l7^Retr{[/4g,t&(r)}=|,9|' . (13) 

The latter identity is most easily seen [l2| in the so-called 
wee-representation [U of p, where U — U®U£ PSU(N) 
(with JJ as the complex conjugate) recalling the projec- 
tive unitary group is PSU(N) = -^j^ = Now 
observe that tr{(C7 ® U)iV ® V)} ^ tT{UV ® UV} = 
\tr{UV}\\ 



D. Core of the Numerical Algorithms: 
Concurrent and Sequential 



C. Figures of Merit 

In this work, we treat quality functions only depending 
on the final state XiT) of the system without taking into 
account running costs, which, however, is no principal 
limitation [93 |. 

No matter whether the X{t) in Eqn. ^ denote states 
or gates, a common natural figure of merit is the projec- 
tion onto the target in terms of the overlap 

9-w±ir.''^^UotXiT)} . (10) 

Depending on the setting of interest, one may choose as 
the actual figure of merit /su ■= Re g or /psu '■= \g\- 



Since the equations of motion for closed and open 
quantum systems as well as the natural overlap-based 
quality functions are of common form, we adopt the uni- 
fied frame for the numerical algorithms to find optimal 
steerings {ujit)}. To this end, we describe first-order 
and second-order methods to iteratively update the set 
of control amplitudes in a unified way for bilinear control 
problems. 

Discretising Time Evolution 

For algorithmic purposes one discretises the time evo- 
lution. To this end, the control terms Bj are switched 
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Table II: List of Symbols 



Symbol Meaning 



j control Hamiltonian index (1. . .m) 

k time slice index (1. . .A4) 

Uj{tk) control amplitude to Hamiltonian j 

in time slice k (more lables below) 
A non-switchable drift term (see Tab. [!} 

Bj switchable control terms (see Tab. |T| 

Xo initial condition (see Tab. IIIII) 

-^target = Xm+1 final Condition (see Tab. IIIII) 
Xk propagator from time tk-i to tk 

Xi^;0 forward propagation of initial state 

up to time tk, i.e. XkXk-i ■ ■ ■ XiXq 
XM+i:k+i 6ocfc«;ard propagation of target state 

up to time tk, i.e. x1^^j,^^.Xm ■ ■ ■ Xk+i 



q subspace selection counter (outer loop) 

s step-within-subspace counter (inner loop) 

r global counter (overall number of steps) 

■fii) c {1. . .M} set of time slices belonging to subspace q 
M'^^ number of time slices in T''-* 

t[^^ tag members of T'"' with fc G {1. . .M'")} 

u'p it\^^) control amplitude to Hamiltonian j for 

subspace q, time slice iteration r 
f figure(s) of merit 



by piecewise constant control amplitudes Uj{tk) G W C M 
with tk € [0,T], where T is a fixed final time and U de- 
notes some subset of admissible control amplitudes. For 
simplicity, we henceforth assume equal discretised time 
spacing Ai := tk—tk-i for all time slices k ~ 1,2, ... , M. 
So T = MAt. Then the total generator (i.e. Hamiltonian 
or Lindbladian) governing the evolution in the time in- 
terval {tk-i,tk] shall be labelled by its final time tk as 

Au{tk):=A + J2^j{tk)B, (14) 
j 

generating the propagator 

Xk := e-^*^-(*'=) (15) 

which governs the controlled time evolution in the time 
slice (tk-i,tk]. Next, we define as boundary conditions 
X{0) Xq and Xm+i := ^target- They specify the 
problem and are therefore discussed in more detail in 
Sec. [ml Tab. Hill A typical problem is unitary gate syn- 
thesis, where Xq = 1 and Xtarget = C^target, whereas in 
pure-state transfer Xq = {ipo) and ^target = |V')targct- 
— In any case, the state of the system is given by the 
discretised evolution 

X{tk) = Xk;Q := XkXk-i ■ ■ ■ -^1^0 • (16) 

Likewise, the state of the adjoint system also known as 
co-state K^tk) results from the backward propagation of 



(a) concurrent (cRAPE-type) 




(b) sequential (Krotov-type) 




Figure 1: (Colour online) Overview on the update schemes 
of gradient-based optimal control algorithms in terms of the 
set of time slices T'"' = {k[^\k''^\. . . fc^f;,,} for which the 
control amplitudes are concurrently updated in each itera- 
tion. Subspaces are enumerated by q, gradient-based steps 
within each subspace by s, and r is the global step counter. 
In GRAPE (a) all the M piecewise constant control amplitudes 
are updated at every step, so T^^' = {1, 2, . . . M} for the sin- 
gle iteration q=l. Sequential update schemes (b) update a 
single time slice once, in the degenerate inner-loop s=l, be- 
fore moving to the subsequent time slice in the outer loop, 
q; therefore here T'^-* — {q mod M}. Hybrid versions (c) 
follow the same lines: for instance, they are devised such as 
to update a (sparse or block) subset of p different time slices 
before moving to the next (disjoint) set of time slices. 



Xm+1 = ^target 

h^{tk) ■■= xI^^^^^XmXm-1 ■ ■ ■ Xk+i ^^^^ 
~ X\^j^^XmXm-\ ■ ■ ■ Xk+\ =: Xu+i-.k+Y 

which is needed to evaluate the figure of merit here taken 
to be 

/ := ^|tr{At(t,)X(ife)}| = |tr{xLgetA(T)}| Vfc (18) 

as the (normalised) projection of the final state under 
controlled discretised time evolution upto time T onto 
the target state. 
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Algorithmic Steps 

With the above stipulations, one may readily charac- 
terise the core algorithm by the following steps, also il- 
lustrated in Fig. [T] and the flowchart in Fig. m 



(0) Initialisation 


Set uf\l. . .M) as i 
Set counters r = 0, ( 


litial amplitudes. 
= 0, s = 1 . 







0. Set initial control amplitudes e C M for 

aU times tk with k e T*"^ := {1,2,. ..,71/} then set 
counters ?■ = 0, q = 0, s = 1; fix siimit and /'. 

1. Outer loop starts enumerated by q: 

Unless r = q = 0, choose a selection of time slices, 
i.e. a subspace, T^'^, on which to perform the next 
stage of the search will update only u^p{t'^^^) for 



t 



(9) 



fce{i...j\/(9)} 

2. Inner loop, enumerated by s: 

Take one or more gradient-based steps within the 
subspace. Depending on subspace choice, num- 
ber of matrix operations may be reduced as com- 
pared to the naive implementation of the algo- 
rithm. 



3. Exponentiate: X 



(r) 



gAtA('-)(t(") for all k e 



r(') with ^I'-^ti'^) ■.= A + (4'^)s, 

4. Compute goal function at some k = k: 



5. Forward propagation: 

6. Backward propagation: 

7. Evaluate current fidelity: 



■X 



(r) 
k+l 



8. If > 1 - Ethrcshoid, done: goto step 13 



9. Else, calculate gradients 

k e r(«) 

10. Gradient-based update step: 



for all 



(4'^') for aU k e r(«) by a method of 
choice (e.g., Newton, quasi-Newton, bfgs or 
L-BFGS, conjugate gradient etc.) 



duj II 

>s + 1, r- 



< f' Vfc e r' 



(r) 



1 and return 



11. If s < siimit and 
then set and s- 
to step 3 

12. q — >q+l. Choose a new subspace T^"^^ and return 
to step 2 

13. Output: 

final control vectors {u^^\tk)\k = 1,2,..., M} for all 
controls j, final quality /'""^ final state X'^^^T), and 
diagnostic output. 

14. Terminate. 



(1) Outer loop init. 
Select subspace X^^' 



(2) Inner Loop 



(3. . .7). Calc. propagator and goals 



(3) Exponentiate 

xM=e^«-*l'^'"L'"'forallfcer" 

(5) Propagate forward: X^Tq 

(6) Propagate backward: ^^m+i-k+i 

(7) Calculate quality function /'^^ 




Optimisation 
goal achieved 
Goto 13 



(9, 10) Calc: Evaluate gradient 



(9) Evaluate {X^ (t^^^ )) 

(10) Update u^^^{t^i^^) by some method 
of choice 




Cont. outer loop 



(12,2) 

q >g+ 1 

r >r + 1 



Cont. inner loop 



(12,1) 

r >.? + 1 

■ >r + 1 



(13) Output results; (14) Terminate 



final control vectors {tk) Vj = 1. . .m, = 1. . .A/, 

final quality Z'""', final state ^''"'(T), and 
diagnostic output. 



Figure 2: Flow diagram for the generalised dynamo optimal 
control search embracing standard grape and Krotov meth- 
ods as limiting special cases. 
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Having set the frame, one may now readily compare 
the Krotov and GRAPE approaches: In Krotov-type al- 
gorithms, we make use of a sequential update scheme, 
where 7"^*^ = {q mod M} and sumit = 1, implying the 
inner loop is degenerate, as only a single step is per- 
formed per subspace selection, giving s=l,r = s. With 
GRAPE, a concurrent update scheme, T^'^-' = {1. . -M}, 
i.e. the entire parameter set is updated in each step of 
the inner loop, implying q=l, r = s and the outer loop 
is degenerate. 

The above construction naturally invites hybrids: algo- 
rithms where the subspace size is arbitrary in the 1. . .A/ 
range and where the size of the subspace to be updated in 
each step q as well as the number of steps within each sub- 
space, s, can vary dynamically with iteration, depending, 
e.g., on the magnitude of the gradient and the distance 
from the goal fidelity. This is a subject of on-going re- 
search. 



E. Overview of the dynamo Package and Its 
Programming Modules 

DYNAMO provides a flexible framework for optimal- 
control algorithms with the purpose of allowing (i) quick 
and easy optimisation for a given problem using the ex- 
isting set of optimal-control search methods as well as 
(ii) flexible environment for development of and research 
into new algorithms. 

For the first-use case, the design goal is to make 
optimal-control techniques available to a broad audience, 
which is eased as DYNAMO is implemented in MATLAB. 
Thus to generate an optimised control sequence to a spe- 
cific problem, one only needs modify one of the provided 
examples, specifying the drift and control Hamiltonians 
of interest, choose GRAPE, Krotov, or one of the other 
hybrid algorithms provided, and wait for the calcula- 
tion to complete. Wall time, CPU time, gradient-size and 
iteration-number constraints may also be imposed. 

For the second use case — developing optimal-control 
algorithms — DYNAMO provides a flexible framework al- 
lowing researchers to focus on aspects of immediate in- 
terest, allowing dynamo to handle all other issues, as 
well as providing facilities for benchmarking and com- 
paring performance of the new algorithms to the current 
cadre of methods. 



Why a Modular Programming Framework ? 

The explorative findings underlying this work make a 
strong case for setting up a programming framework in 
a modular way. They can be summarised as follows: 

(a) There is no universal single optimal-control al- 
gorithm that serves all types of tasks at a time. For 
quantum computation, unitary gate synthesis, or state- 
to-state transfer of (non)-pure states require accuracies 



beyond the error-correction threshold, while for spec- 
troscopy improving robustness of controls for state-to- 
state transfer may well come at the expense of lower 
maximal fidelities. 

(b) Consequently, for a programming framework to be 
universal, it has to have a modular structure allowing to 
switch between different update schemes (sequential, con- 
current and hybrids) with task-adapted parameter set- 
tings. 

(c) In particular, the different update schemes have 
to be matched with the higher-order gradient module 
(conjugate gradients, Newton, quasi- Newton). For in- 
stance, with increasing dimension the inverse Hessian 
for a Newton-type algorithm becomes computationally 
too costly to be still calculated exactly as one may eas- 
ily afford to do in low dimensions. Rather, it is highly 
advantageous to approximate the inverse Hessian and 
the gradient iteratively by making use of previous runs 
within the same inner loop (see flow diagram to Fig. [TJ 
Fig. [2]). This captures the spirit of the well-established 
limited-memory Bro yde n-Fletcher-Goldfarb-Shanno (l- 
BFGS) approach [H, lH, [ill . The pros of l-bfgs, how- 
ever, are rather incompatible with restricting the number 
of inner loops to Smax = 1 as is often done in sequential 
approaches. Therefore in turn, gradient modules scaling 
favourably with problem dimension may ask for matched 
update schemes. 

(d) It is a common misconception to extrapolate from 
very few iterations needed for convergence in low dimen- 
sions that the same algorithmic setting will also perform 
best in high dimensional problems. Actually, effective 
CPU time and number of iterations needed for conver- 
gence are far from being one-to-one. — The same fea- 
ture may be illustrated by recent results in the entirely 
different field of tensor approximation, where again in 
low dimensions, exact Newton methods outperform any 
other by number of iteration as well as by CPU time, while 
in higher dimensions, exact Newton steps cannot be cal- 
culated at all (see Figs. 11.2 through 11.4 in Ref. 

It is for these reasons we discuss the key steps of the 
algorithmic framework in terms of their constituent mod- 
ules. 



1. Gradient-Based Update Modules 

Here we describe the second-order and first-order 
control-update modules used by the respective algo- 
rithms. 

Second-Order (Quasi) Newton Methods: The array of 
piecewise constant control amplitudes (in the r*^ itera- 
tion), {u~p{t^^^) I j = 1, 2, . . . , m and fc = 1, 2, . . . , M^*)} 
are concatenated to a control vector written lu^*"^) for 
convenience (in slight abuse of notation). Thus the stan- 
dard Newton update takes the form 

^ +a,H-i|grad/M). (19) 
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Here is again a step size and 'H~^ denotes the inverse 
Hessian, where jgrad/^'"-') is the gradient vector. For 
brevity we also introduce shorthands for the respective 
differences of control vectors and gradient vectors 



\xr) := kt^'-+i) 



M^'')) and 

yr) :=|grad/(^+i))-|grad/M) 



Now in the Broyden-Fletcher-Goldfarb-Shanno standard 
algorithmic scheme referred to as bfgs [E^I, the inverse 
Hessian is conveniently approximated by making use of 
previous iterations via 



n;l^ = V*n;^Vr + Trr\Xr){Xr\ 
with the definitions 



(20) 



and Vr '■= t— ■Kr\yr){Xr 



By its recursive construction, (i) BFGS introduces time 
non-local information into the optimisation procedure 
as soon as the inverse Hessian has off-diagonal compo- 
nents and (ii) bfgs perfectly matches concurrent updates 
within the inner loop: using second-order information 
makes up for its high initialisation costs by iterating over 
the same subspace of controls throughout the optimisa- 
tion. Note that the MATLAB routine fminunc uses the 
standard BFGS scheme, while the routine fmincon uses the 
standard limited- memory variant l-bfgs (sil. [65l. [66l. [68| . 
Another advantage of the BFGS scheme is that the ap- 
proximate Hessian is by construction positive-definite, 
allowing for straightforward Newton updates. 

In contrast, for sequential updates, BFGS is obviously 
far from being the method of choice, because sequen- 
tial updates iterate over a changing subset of controls. 
In principle, direct calculation of the Hessian is possi- 
ble. However, this is relatively expensive and the local 
Hessian is not guaranteed to be positive definite, neces- 
sitating the need for more complex trust-region Newton 
updates. A detailed analysis of optimal strategies for se- 
quential update methods is necessary and is presented in 
|69| . Preliminary numerical data (see Sec. lIVDjl suggest 
that the gain from such higher-order methods for sequen- 
tial update schemes is limited and not sufficient to offset 
the increased computational costs per iteration in gen- 
eral. Thus we shall restrict ourselves here to sequential 
updates based on first-order gradient information. 

First-Order Gradient Ascent: The simplest case of a 
gradient-based sequential-update algorithm amounts to 
steepest-ascent in the control vector, whose elements fol- 
low 



a. 



Qf{r 



(21) 



where is an appropriate step size or search length. For 
gate optimization problems of the type considered here 
it can be shown that sequential gradient update with 



suitable step-size control can match the performance of 
higher order methods such as sequential Newton updates 
while avoiding the computational overhead of the lat- 
ter [6§|. Although choosing a small constant ensures 
convergence (to a critical point of the target function) 
this is usually a bad choice. We can achieve much better 
performance with a simple heuristic based on a quadratic 
model ar(2 — a^) of / along the gradient direction in the 
step-size parameter a^. Our step-size control is based 
on trying to ensure that the actual gain in the fidelity 
A/ = f{ar) — /(O) is at least 2/3 of the maximum gain 
achievable based on the current quadratic model. Thus, 
we start with an initial guess for a^, evaluate Af{ar) and 
use the quadratic model to estimate the optimal step size 
a* (r) . If the current is less than 2 /3 of the optimum 
step size then we increase by a small factor, e.g., 1.01; 
if ar is greater than 4/3 of the estimated optimal Q;*(r) 
then we decrease by a small factor, e.g., 0.99. Instead 
of applying the change in immediately, i.e., for the 
current time step, which would require re-evaluating the 
fidelity, we apply it only in the next time step to give 



1.01 a,, for a,. < |Q!,(r) 
ctr+i = { 0.99 a,- for a,- > ^a^{r) . (22) 



ar 



else 



For sequential update with many time steps, avoiding the 
computational overhead of multiple fidelity evaluations is 
usually preferable compared to the small gain achieved by 
continually adjusting the step size ar at the current time 
step. This deferred application of the step size change is 
justified in our case as for unitary gate optimization prob- 
lems of the type considered here, as usually quickly 
converges to an optimal (problem-specific) value and only 
varies very little after this initial adjustment period, re- 
gardless of the initial ar (69| . 

As has been mentioned above, this step-size control 
scheme for sequential update comes close to a direct 
implementation of trust-region Newton (see Fig. El in 
Sec. IIVD|) . a detailed analysis of which is given in |69j . 



2. Gradient Modules 

Exact Gradients: In the module used for most of the 
subsequent comparisons, exact gradients to the exponen- 
tial maps of total Hamiltonians with piecewise constant 
control amplitudes over the time interval At are to be 
evaluated. Here we use exact gradients as known from 
various applications [3l|, [t^]. Their foundations were 
elaborated i n [tH.!?^ , so here we give a brief sketch along 
the lines of IlO, El (leaving more involved scenarios be- 
yond piecewise constant controls to be dwelled upon else- 
where). For 

X cxp{-iAti7„} = cx.p{-iAt{Hd + ^UjHj)} (23) 
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the derivative invokes the spectral theorem to take the 
form 



if A, = A 



-iAtX, ^ — iAt>,„ 



-^At{Xl\H,\X^) ^ ifAz^ 



(24) 



where in the second identity we have dehberately kept 
the factor —iAt for clarity. Thus the derivative is given 
elementwise in the orthonornial eigenbasis {[A;)} to the 
real eigenvalues {Xi} of the Hamiltonian Hu- Details are 
straightforward, yet lengthy, and are thus relegated to 
Appendix A. 

Approximate Gradients: In Ref. we took an ap- 
proximation valid as long as the respective digitisation 
time slices are small enough in the sense At ^ l/||i7„||2 
with Hu as in Eqn. 



(25) 



This approximation can be envisaged as replacing the 
average value brought about by the time integral over the 
duration At = tk — tk-i, which in the above eigenbasis 
takes the form 

= -i j dTe''^^^'>'-^\Xi\H,\Xm)e-'^-^--'---^ 
tk-i 



-iAt {Xl\Hj\X,n, 



-iXmAt 



(26) 



by the value of the integrand at the right-hand side of 
the time interval r g [tk-i,tk\. Clearly, this approxi- 
mation ceases to be exact as soon as the time evolution 
U{tk,tk-i) = e~*^*^" fails to commute with Hj. Generi- 
cally this is the case and the error scales with |A; — A„i|At. 

Finite Differences provide another standard alterna- 
tive, which may be favourable particularly in the case of 
pure-state transfer, see (73| . 



3. Exponentiation Module 

Matrix exponentials are a notorious problem in com- 
puter science fz^ . [75| . Generically, the standard mat- 
lab module takes the matrix exponential via the Pade- 
approximation, while in special cases (like the Hermitian 
one pursued throughout this paper) the eigendecomposi- 
tion is used (95| . 

From evaluating exact gradients (see above) the eigen- 
decomposition of the Hamiltonian is already available. 



Though in itself the eigendecomposition typically comes 
at slightly higher computational overhead than the Pade 
matrix exponential, this additional computational cost is 
outweighed by the advantage that evaluating the matrix 
exponential now becomes trivial by exponentiation of the 
eigenvalues and a matrix multiplication. 

Thus as long as the eigendecompositions are available, 
the matrix exponentials essentially come for free. Since 
in the sequential-update algorithm, the gradient needed 
for the exponential in time slice k requires an update in 
time slice fc — 1, the exponential occurs in the inner loop 
of the algorithm, while obviously the concurrent-update 
algorithm takes its exponentials only in the outer loops. 
The total number of exponentials required by the two 
algorithms are basically the same. 



4- Reducing the Number of Matrix Operations 

As described above, the search for an optimal control 
sequence proceeds on two levels: an outer loop choosing 
the time slices to be updated (a decision which may imply 
choice of gradient-based step method, as well as other 
control parameters), and an inner loop which computes 
gradients and advances the search point. With dynamo, 
significant effort has been made to optimise the overall 
number of matrix operations. 

For a general hybrid scheme, where T'''' is a sub- 
set of time slices {^i*''- • ■^^'^''(,) } approach is as follows: 
Given time slices Xi,...,A'a/, of which in hybrid up- 
date schemes we select for updating any general set 
Xt-^, ■ . ■ ,Xt^, we can collapse multiple consecutive non- 
updating X into a single effective Y . For example, 
consider Xi,...,A"io of which we update X2,X5 and 
Xg. Before proceeding with the inner loop, we gener- 
ate concatenated products Yi , . . . , 14 such that Yi = Xi , 
Y2 = and XioXgATgXv. Now the heart of 

the expression to optimise for is Y^XqX^Y2X2Yi. 

As a result, computation of forward and backward 
propagators can be done with the minimal number of ma- 
trix multiplications. Matrix exponentiation is also min- 
imised by way of caching and making use of the fact that 
for some gradient computation schemes eigendecomposi- 
tion is required, thus allowing for light-weight exponen- 
tiation. 

Moreover, the dynamo platforms isolates the prob- 
lem of minimising matrix operations to a specific module, 
which is aware of which i?u-s, X-s and A-s are needed for 
the next step, compares these with the time slices which 
have been updated, and attempts to provide the needed 
data with the minimal number of operations. And while 
for some hybrid update schemes the current number of 
operations performed in the outer loop is not strictly 
optimal in all cases, optimality is reached for Krotov, 
grape and schemes which update consecutive blocks of 
time slices. 



9 



5. Modularisation Approach in dynamo 

To allow for flexibility in design and implementation of 
new optimal control techniques, the framework is mod- 
ularised by way of function pointers, allowing, e.g., the 
second-order search method to receive a pointer to a func- 
tion which calculates the gradient, which in-turn may 
receive a pointer to a function which calculates the expo- 
nential. The cross-over algorithm described Fig. 21 e.g., is 
implemented by a search method receiving as input two 
search-method modules and a cross-over condition, which 
is used termination condition for the first search 

method. The first-order hybrids described in Fig. |8] are 
similarly implemented by a block-wise subspace selection 
function (generalisation of the sequential versus concur- 
rent selection schemes) receiving a pointer to the search 
function to be used within each block. DYNAMO is pro- 
vided with many such examples. 

If one is exploring, e.g., second-order search methods 
appropriate for serial update schemes, one only needs 
to write the update-rule function, dynamo will provide 
both the high-level subspace-selection logic and the low- 
level book-keeping that is entrusted with tracking which 
controls have been updated. When given a demand for 
gradients, propagators or the value function, it performs 
the needed calculations while minimising the number of 
matrix operations. Moreover, once a new algorithm is 
found, DYNAMO makes it easy both to compare its per- 
formance to that of the many schemes already provided 
as examples and to do so for a wide set of problems de- 
scribed in this paper. Thus DYNAMO serves as a valuable 
benchmarking tool for current and future algorithms. 



III. STANDARD SCENARIOS FOR QUANTUM 
APPLICATIONS 

We have discussed the versatile features of the frame- 
work embracing all standard scenarios of bilinear quan- 
tum control problems listed in Tab. HI Here we give the 
(few!) necessary adaptations for applying our algorithms 
to such a broad variety of paradigmatic applications, 
while our test suite is confined to unitary gate synthesis 
and cluster-state preparation in closed quantum systems. 



A. Closed Quantum Systems 

The most frequent standard tasks for optimal control 
of closed systems comprise different ways of gate synthe- 
sis as well as state transfer of pure or non-pure quantum 
states. More precisely, sorted for convenient development 
from the general case, they amout to 

Task 1: unitary gate synthesis up to a global phase. 
Task 2: unitary gate synthesis with fixed global phase. 
Task 3: state transfer among pure-state vectors. 
Task 4: state transfer among density operators. 



As will be shown, all of them can be treated by common 
propagators that are of the form 

Xk = cxp{-iAt Hu{tk)} 

= exp{-^A^(i^<^ + ^u,(^fc)i^,)} . (27) 

j 

Algorithmically, this is very convenient, because then the 
specifics of the problem just enter via the boundary con- 
ditions as given in Tab. IIIII clearly, the data type of the 
state evolving in time via the propagators Xk is induced 
by the initial state being a vector or a matrix represented 
in Hilbert space or (formally) in Liouville space. 

Indeed for seeing interrelations, it is helpful to for- 
mally consider some problems in Liouville space, before 
breaking them down to a Hilbert-space representation for 
all practical purposes, which is obviously feasible in any 
closed system. 

Task 1 projective phase-independent gate synthesis: 
In Tab. lIIII the target projective gate C/targct can be taken 
in the phase-independent superoperator representation 
X := X (E) X to transform the quality function 

/psc/ = l^Rctr{C/4g,,X(r)} 

= ^ Re tr{(C/*„g„tXT) ® (C/Lget^T)} 

^^ItriuLXrW so 

fpsu = i\tT{uLXT}\ = ^\tr {Al^,.,^,Xk..o}\ , 

(28) 

where the last identity recalls the forward and back- 
ward propagations X{tk) XkXk~i ■ ■ ■ X2X1XQ and 

A^(t/c) := uI^^^^^XmXm-i ■ ■ ■ Xk+2Xk+i. 

So with the overlap g 'k^^^^\i+i-k+i^k:o\ of 
Eqn. pT|) . the derivative of the squared fidelity with re- 
spect to the control amplitude Uj[tk) becomes 

= ^Re ir{g*KU,,kA^)Xk-,,} , 

(29) 

where is given by Eqn. ((M)) . The term g* arises via 
P{u) = \g{u)\\ so that by = 2 \g{u)\{^Jg{u)\) one 

gets (for \g{u)\ ^ 0) f| = £\g{u)\ = to arrive 

at 

'^"'tf'"'' - i Re tr{e-'^^AU,,^,(M^)X._,o} , 

(30) 

where e~"'^s := g* /\g\ uses the polar form g = |g|e+*'^9 
for a numerically favourable formulation. 

Thus, in closed systems, the superoperator represen- 
tation is never used in the algorithm explicitly, yet it is 
instructive to apply upon derivation, because Task 2 now 
follows immediately. 
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Table III: Boundary Conditions for Standard Scenarios 



Conditions 


Initial 


Final 








closed systems: 






pure-state transfer 




I'i/') target 


gate synthesis I 


In 


arget 


state transfer 


po 


ptargct 


gate synthesis II 




fAargct 


open systems: 






state transfer 


po 


ptargct 


map synthesis 


1n2 


-?*targct 



state of the system: evolution of initial state as 
X{tk) = Xk-Q :— XkXk-i ■ ■ ■ XiXo 
with propagators X^ = e'^^'-'^+'^i "i{*,.)Bj) fo;. ^ i 
and with A, Bj as defined in Tab. |l] 



Task 2 phase- dependent gate synthesis: 

In Tab. Illll the target gate C/targct now directly enters the 

quality function 

fsu = ^ Re tr {uI,Xt} = ^ Re tr {Aij+,.,k+iXk:o} ■ 

(31) 

So the derivative of the fidelity with respect to the control 
amplitude Uj{tk) with reference to of Eqn. (|24|) reads 



It is in entire analogy to Eqn. ([201) • 



(32) 



Actually, this problem can be envisaged as the lifted 
operator version of the pure-state transfer in the subse- 
quent Task 3, which again thus follows immedialtely as 
a special case. 

Task 3 transfer between pure-state vectors: 

Target state and propagated initial state from Tab. IIIIl 

|'/')targct, X{T)\i'o) form the scalar product in the quality 

function 

/ = ^ Re (Vtargctl^T) = ^ Rc [tr] { A;,^,.,^,Xfe:0 } , 

(33) 

where the latter identity treats the propagated column 
vector Xk:i\Xo) as iV x 1 matrix Xk-o and likewise the 
back-propagated final state ('0tar|(ArM:fc-i-i)^ as 1 x ma- 
trix so the trace can be ommited. Hence the 
derivative of the fidelity with respect to the control am- 
plitude Ujitk) remains 



'-I^^ = ^Ro[tr]{AU,,,,(||f)X,_,o} 
with 1^ of Eqn. 



(34) 



Task 4 state transfer between density operators: 

The quality function normalised with respect to the 

(squared) norm of the target state c := Uptarlli reads 



/= iRc tr{xl,+iAdx,(Xo)} 
= i Re tY{xl,^^XTXoX},} 
= i Re iY{Xj^j^-^XMXM-i ■ ■ ■ Xk - ■ ■ X2X1XQ': 

X x\xl ■■■xl--- xlj_^xl,^} . 



(35) 



Hence the derivative of the quality function with re- 
spect to the control amplitude Uj{tk) takes the somewhat 
lengthy form 



M^.iRe(tr{xU,^ 
X Xlxl ■ 



M 



■Xl 



.ax.) 

^ OUj / 

■xli} 



■ A^2^i A^o X 



t'-'i^'^M+l^Af • • • Xfc • • • X2XiXq^. 

x44-- ••4.}) 



(36) 



where the exact gradient again follows Eqn. 

Notice that Task 1 can be envisaged as the lifted op- 
erator analogue to Task 4 if phase independent projec- 
tive representations \i^i,){t]-J,j\ of pure states are to 
be transferred. 



B. Open Quantum Systems 

Task 5 quantum map synthesis in Markovian systems: 
The superoperator Hu{tk) to the Hamiltonian above can 
readily be augmented by the relaxation operator T. Thus 
one obtains the generator to the quantum map 

Xk = cxp{-At (iHuitk) + r{tk))} (37) 

following the Markovian equation of motion 

X{t) = -{iH., + T) X{t) . (38) 

By the (super)operators X{tk) :~ XkXk-i ■ ■ ■ XiX^ 



and A^tk) 



t^M ■ Xm^i ■ ■ ■ Xk+2Xk+i the 



derivative of the trace fidelity at fixed final time T 
/ = ^Re tT{FL^,,X{T)} =^Rc tr{At(t,)X(i,.)} 



with respect to the control amplitude Ujitj^) formally 
reads 



Of 



i,Rc tr{At(t,)(^)X(tfc„i)} (39) 



du,{tk} N 



Since in general T and iHu do not commute, the semi- 
group generator (t_ff„ + F) is not normal, so taking the 
exact gradient as in Eqn. (|24p via the spectral decomposi- 
tion has to be replaced by other methods. There are two 
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convenient alternatives, (i) approximating the gradient 
for sufficiently small At < l/\\iHu + ^2 by 



-At iH„ 



duj(tk) 

or (ii) via finite differences. 



(40) 



This standard task devised for Markovian systems |27| 
can readily be adapted to address also non- Markovian 
systems, provided the latter can be embedded into a (nu- 
merically manageable) larger system that in turn inter- 
acts with its environment in a Markovian way (28| . 

Task 6 state transfer in open Markovian systems: 
This problem can readily be solved as a special case of 
Task 5 when envisaged as the vector version of it. 

To this end it is convenient to resort to the so-called 
uec-notation [t^ of a matrix M as the column vector 
vec(M) collecting all columns of M . Now, identifying 

Xf) := vec(po) and ^t^argct ■= '^6C*(pJargct) one obtains the 
propagated initial state X{tk) := X^Xk-i ■ ■ ■ XiXo and 
A'f(tfc) := xI^^^^^XmXm-i ■ ■ ■ Xk+2Xk+i as back propa- 
gated target state. In analogy to Task 3, they take the 
form of N'^ X 1 and 1 x A^^ vectors, respectively. Thus 
the derivative of the trace fidelity at fixed final time T 

/ = i Re [tr]{xl,^,,X{T)} = i Re [tr]{At(ifc)X(tfe)} 

with respect to the control amplitude Uj(tk) reads 

4^ = iRe[tr]{At(t,)(^)X(t,_i)} , (41) 



where for 



g^ ^i^-^ the same gradient approximations apply 
as in Task 5. 

For the sake of completeness, Appendix C gives all the 
key steps of the standard Tasks 1 through 6 in a nutshell. 



IV. RESULTS ON UPDATE SCHEMES: 
CONCURRENT AND SEQUENTIAL 

A. Specification of Test Cases 

We studied the 23 systems listed in Tab. IIVI as test 
cases for our optimisation algorithms. This test suite 
includes spin chains, a cluster state system whose effec- 
tive Hamiltonian represents a C4 graph, an NV-centre 
system and two driven spin-j systems with j = 3, 6. At- 
tempting to cover many systems of practical importance 
(spin chains, cluster-state preparation, NV-centres) with 
a range of coupling topologies and control schemes, the 
study includes large sets of parameters like system size, 
final time, number of time slices, and target gates. We 
therefore anticipate our suite of test cases will provide 
good guidelines for choosing an appropriate algorithm in 
many practical cases. 



1. Spin Chains with Individual Local Controls 

Explorative problems 1-12 are Ising-ZZ spin chains 
of various length in which the spins are addressable by 
individual x- and y-controls. The Hamiltonians for these 
systems take the following form; 



n-l 
J \ ^ „zz 

k=l 

1 

2 ''j 



(42) 
(43) 



where J=:l,n = l,...,5 and j = I, 



111 example 1 we also consider linear crosstalk (e.g., via 
off-resonant excitation), leading to the control Hamilto- 



Hl,2 = Ci\.2<^\ + a2,lO'2 
i?3.4-/32.1fTi^+/3l,2CT! 



(44) 
(45) 



where are independent control fields and and 
are crosstalk coefficients. We chose ai = /32 = 1 and 
a2 = /3i = 0.1. 



2. Cluster State Preparation in Completely Coupled Spin 
Networks 



The effective Hamiltonian of test problems 13 and 



14, 



Hcs = iK^2 + ^2^1 + ^3^1 + ' (46) 



represents a C4 graph of Ising-ZZ coupled qubits which 
can be used for cluster state preparation according 
to ■ The underlying physical system is a completely 
Ising-coupled set of 4 ions that each represents a locally 
addressable qubit: 



3 4 

k=l l=k+l 



1 

2 ''3 



(.? = !,• ••,4). 



(47) 
(48) 



Again, the coupling constant J was set to 1. The follow- 
ing unitary was chosen as a target gate, which applied 
to the state = ((|0) + \1))/V2)'^^ generates a cluster 
state 

Ug = exp{-i^Hcs) ■ (49) 



3. NV-Centre in Isotopically Engineered Diamond 

In test problems 15 and 16 we optimised for a CNOT 
gate on two strongly coupled nuclear spins at an nitrogen- 
vacancy (NV) centre in diamond as described in (78| . 
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In the eigenbasis of the coupled system, after a trans- 
formation into the rotating frame, the Hamiltonians are 
of the form 

Hd = diag(i;i,£;2,£;3,-B4)+cj,diag(l, 0,0,-1) (50) 

Hi = ^ {fJ-12Cri2 + M130-13 + M240-24 + M34Cr3_4) (51) 
H-I = 5(^12^^12 + M130-13 + M240-I4 + ^34(^14) • (52) 

Here Ei . . . E4 are the energy levels, uj,, is the carrier 
frequency of the driving field and iia,p is the relative 
dipole moment of the transition between levels a and 
/3. We chose the following values for our optimisations: 
{Ei,E2,E3,E4} = 27r{-134.825, -4.725, 4.275, 135.275} 
MHz, = 27r X 135 MHz, {^12, Mi3ii^4, ^^34} = 
{1, 1/3.5, 1/1.4, 1/1.8} in accordance with [TSj . 

4- Special Applications of Spin Chains 

Test problems 17 and 18 are modified five-qubit Ising 
chains extended by a local Stark-shift term being added 
in the drift Hamiltonian Hd resembling a gradient. The 
control consists of simultaneous x- and y-rotations on all 
spins 

/ 4 

Hd^-Y.a^a!^,^ii + 2)at (53) 

5 5 

^i = ^E< and i^2 = i^ar. (54) 

i=l i=\ 

Problem 19 is a Heisenberg-XXX coupled chain of 
five spins extended by global permanent fields inducing 
simultaneous x-rotations on all spins: 

4 

^-|E<^?+i+^''^m+<^m-10<- (55) 
1=1 

Control is exerted by switchable local Stark shift terms, 

H.^ol (i = l,...,5). (56) 

Spin chains may be put to good use as quantum wires 
[43I . pTOl - is^ . The idea is to control just the input end 
of the chain using the remainder to passively transfer 
this input to the other end of the chain. To embrace 
such applications, in problems 20 and 21, the spins 
are coupled by an isotropic Heisenberg-XXX interaction 
and the chains are subject to x- and y-controls only at 
one end (at one or two spins, respectively): 

n-l 
i=l 

li^.2^\al'^ and {B^^^^]^a^^^) (58) 

Here J = 1 and n = 3, 4. Restricting the controls in this 
way makes the systems harder to steer and thus raises 
the bar for numerical optimisation. 



5. Spin-3 and Spin-6 Systems 

As an example beyond spin- 1/2 systems, in test prob- 
lems 22 and 23 we consider a Hamiltonian of the fol- 
lowing form [S^ 

Hu = Jl +uiJ-,+ U2Jx, (59) 

where the Ji are angular momentum operators in spin-j 
representation. The term represents the drift Hamil- 
tonian and the other two terms function as controls. We 
chose j = 6 for problem 22 and j = 3 for problem 23. 



B. Test Details 

As shown in Tab. II VI we optimised each test system for 
one of four quantum gates: a CNOT, a quantum Fourier 
transformation, a random unitary, or a unitary for cluster 
state preparation according to section IIV A21 Random 
unitary gates generated according to the Haar measure 
[s^ are meant to be numerically more demanding than 
the other gates. The final times T were always chosen 
sufficiently long to ensure the respective problem is solv- 
able with full fidelity (hence the times should not be mis- 
taken as underlying time-optimal solutions). All results 
were averaged over 20 runs with different initial pulse 
sequences (control vectors) , i.e. randomly generated vec- 
tors with a mean value of mean{uini) = and a standard 
deviation of std[uini) = 1 in units of 1/J unless speci- 
fied otherwise (as in Tab. IVIl where std{uini) = 10 to 
study the influence of the initial conditions). The max- 
imum number of loops was set to 3000 for the concur- 
rent update scheme and to 300000 for the sequential up- 
date. All systems were optimised with a target fidelity 
of /target = 1 — 10~^. As an additional stopping criterion 
the change of the function value from one iteration to the 
next (concurrent update) or between the last iteration 
and the average of the previous M iterations (sequential 
update) was introduced. The threshold value in this case 
was set to 10~^. For the concurrent update algorithm, 
the optimisation stopped when the smallest change in 
the control vector was below 10~®. We measured the wall 
times of our optimisations to give a measure for the actual 
running time form start to completion (including, e.g., 
memory loads and communication processes) instead of 
only measuring the time spent on the CPU. The optimi- 
sations were carried out under matlab R2009b (64bit, 
single-thread mode) on an AMD Opteron dual-core CPU 
at 2.6 GHZ with 8 GB of ram. (The dynamo hybrids ran 
later with an extension to 32 GB of ram under features 
of matlab R2010b). The wall time was measured us- 
ing the tic and toe commands in MATLAB. Pure Krotov 
vs GRAPE comparisons (Tabs. IVl through IVII[) were car- 
ried out on separate optimised MATLAB implementations 
thus avoiding any overhead (e.g., loops and checks) re- 
quired for more flexibility in DYNAMO, where the hybrids 
(Figs. [51 were run. 
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Figure 3: (Colour) Optimisation results for problems 12, 14, 17, and 22 shown in doubly logarithmic plots; each optimisation 
is run with twenty random initial conditions; the trace of mean values is given in boldface. The blue (concurrent) and red 
(sequential) lines depict the deviation of the quality from the maximum of 1 as a function of the wall time. Each line represents 
one optimisation. The insets show the initial behaviours and crossing points in a log-linear scale. For the sequential-update 
algorithm in problem 22 (last panel), the thresholds for the change in the control and function values have to be lowered (to 
10"^" instead of the standard 10~* [see test conditions]) for reaching qualities comparable to the ones the concurrent scheme 
arrives at under standard conditions. (Note the altered thresholds apply as well to the data listed in Tab. rV|l . 



C. Test Results and Discussion 

From the full set of data presented in Tab. |Vl Fig. [3] 
selects a number of representatives for further illustra- 
tion. Note the following results: First, in most of the 
problems, sequential and concurrent-update algorithms 
reach similar final fidelities, the target set to 1 — 10~^ 
being in the order of a conservative estimate for the 
error-correction threshold [sHi- Out of the total of 23 
test problems, this target is met within the limits of it- 
erations specified above except in problems 5, 7, 10, 12 
and 13. Only in problem 23 the sequential-update al- 
gorithm yields average residual errors (1-fidelity) up to 
two orders of magnitude higher than in the concurrent 
optimisation. Remarkably enough, the average running 
times differ substantially in most of the test problems, 
with the concurrent-update algorithm being faster. Only 



in problems 3, 4, 15 and 16 the final wall times are simi- 
lar. Note that in all but the very easy problems 3, 4, and 
16, the sequential algorithm needs a larger total number 
of matrix multiplications and eigendecompositions. In 
particular, due to the slower convergence near the crit- 
ical points, the sequential-update scheme requires more 
iterations in order to reach the target fidelity of 1 — 10~^ 
thus resulting in a greater number of matrix multiplica- 
tions and eigendecompositions. 

In many problems (3, 5, 6, 8, 9, 11, 12, 14, 18, 19, 21, 
and 22), we observe a crossing point in time course of 
the fidelity of the two algorithms. The sequential-update 
algorithm is overtaken by the concurrent-update scheme 
between a quality of 0.9 and 0.99 (see, e.g., Problem 21 
in Fig. [3]). Therefore, exploiting the modular framework 
of the programming package to dynamically change from 
a sequential to a concurrent-update scheme at a medium 
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Problem 21 
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Figure 4: (Colour) Example of a handover (green) from a 
sequential- (red) to a concurrent-update (blue) scheme. The 
sequential algorithm is run up to a handover quality of 0.93, 
where the resulting pulse sequence is then used as input to 
the concurrent algorithm for optimisation up to the target 
quality. This type of handover is supported by the modular 
structure of dynamo. 



Problem 8 
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Figure 6: (Colour) Comparison of sequential-update methods 
with first-order gradient information (red track) and with a 
direct implementation of a trust-region Newton method (blue 
dotted track) showing that per iteration the gains are similar, 
in particular in the long run. The curves represent averages 
over 100 trajectories with random initial conditions. 
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Figure 5: (Colour) Comparison of unconstrained and (loosly) 
constrained optimisations. The concurrent-update algorithm 
uses the standard MATLAB-toolbox functions fminunc and 
fmincon with the latter being slower than the former, as 
it may switch between different internal routines. The 
sequential-update algorithm uses a very basic cut-off method 
for respecting the constraints, which shows little effect on the 
performance. 



fidelity can be advantageous. This is exemplified in the 
(constrained) optimisation shown in Fig. |4j here the se- 
quential method is typically faster at the beginning of the 
optimisation, whereas the concurrent method overtakes 
at higher fidelities near the end of the optimisation. — 
Moreover with regard to dispersion of the final wall times 
required to achieve the target fidelity, in problems 5, 7, 
10, 11, 12, 13 and 23 the sequential-update algorithm 
shows a larger standard deviation thus indicating higher 



sensitivity to the initial controls. 

Also on a more general scale, we emphasise that the 
run-times may strongly depend on the choice of ini- 
tial conditions. Results for larger initial pulse ampli- 
tudes with a higher standard deviation can be found in 
Tab. IVII Increasing mean value and standard deviation 
of the initial random control-amplitude vectors typically 
translates into longer run times. This effect is more pro- 
nounced for sequential than for concurrent-update algo- 
rithms. Consequently, the performance differences be- 
tween the two algorithms may increase and crossing or 
handover points may change as well. 

Finally, as shown in Fig. [51 the performance of 
the concurrent-update scheme also differs between con- 
strained and unconstrained optimisation, i.e. between the 
standard MATLAB subroutines fmincon and fminunc (see 
MATLAB documentation). In contrast, the sequential- 
update algorithm uses the same set of routines for both 
types of optimisations, where a basic cut-off method for 
respecting the constraints has almost no effect, as also 
illustrated by Fig. [S] 



D. Preliminaries on Trust-Region Newton 
Methods for Sequential-Update Algorithms 

Fig. [6] shows that the sequential-update method with 
first-order gradient information used in this work already 
achieves a quality gain per iteration that comes closest to 
the one obtained by a direct implementation of a trust- 
region Newton method. However, as is analysed in detail 
on a larger scale in (69| . the small initial advantage per 
iteration of latter against the former is outweighed CPU- 
timewise by more costly calculations, which is why we 
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Problem 6 (3 spins, T = 7, M = 140) 
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Figure 7: (Colour) Comparison of four different methods for 
computing gradients in 20 unitary optimisations of problem 6. 
Apart from the standard approximation, all methods compute 
exact gradients. By making use of the spectral decomposition, 
diagonalisingthe total Hamiltonian to give exact parameter 
derivatives |7Ch72 | is the fastest among these methods, be- 
cause by the eigen-decomposition the matrix exponential can 
be settled as well (i.e. in the same go). In case of optimis- 
ing controls for (pure) state-to-state transfer, the standard 
approximation can be shown to be competitive. 
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have used the first-order gradients for comparison. 



E. Comparing Gradient Methods 

We compare the performance of four different methods 
to compute gradients for the concurrent algorithm: in 
addition to the standard approximation and the exact 
procedure described in section UlE 1[ we follow Ref. f86| 
and study a Taylor series to compute the exponential 
and a Hausdorff series to compute the gradient, while 
the fourth method is standard finite-differences. Note 
that Hausdorff series and finite differences can be taken 
to a numerical precision exceeding that of the standard 
approximation. 

An example of the performance results found for these 
four methods is given in Fig. [71 where we optimise con- 
trols for a QFT on the 4-spin system of problem 6. Uni- 
tary optimisations on other systems yield similar results, 
with the diagonalisation being the fastest methods in all 
cases. For state-to-state transfer (pure states), however, 
the standard approximation performs well enough as to 
be competitive with exact gradients by diagonalisation. 
Note that for unitary gate synthesis of generic gates, one 
cannot use sparse-matrix techniques, for which the Haus- 
dorff series is expected to work much faster as demon- 
strated in the software package spinach [s^I- 



Figure 8: (Colour) Performance of generalising the first- 
order- gradient sequential scheme to updating blocks of joint 
time slices and allowing for multiple iteration steps within 
each block {sumu > 1), as applied to (a) test problem 2 and 
(b) 21 (see Tab.|fV]and Sec lIVAll) . Original Krotov modifies 
one time slice in a single iteration (s = 1) before moving to 
the subsequent time slice to be updated: this special case is 
shown in the lower corner of the plot, while the upper right is 
the first-order variant of grape (in a suboptimal setting, since 
the step-size handling is taken over from the one optimised for 
Krotov). Wall times represent the average over 42 runs with 
random initial control vectors (again with mean{ui„i) — 
and std{ui„i) = 1 in units of 1/J); times are cut off at 60 
(resp. 300) sec. Note that in problem 2 the hybrid first-order 
versions are not faster than the original Krotov, while in prob- 
lem 21 it pays to concurrently update four or five time slots by 
a single step before moving on to the next set of time slots. — 
Note that in other cases also the first-order concurrent update 
can be fastest, see Fig. 1101 



F. Hybrid Schemes 

Using DYNAMO, we have just begun to explore the mul- 
titude of possible hybrid schemes. Here we present first- 
order (Fig. [5]) as well as second-order (Fig. ^ schemes, 
where the hybrids are taken with respect to sequen- 
tial versus concurrent subspace-selection. More precisely, 
this amounts to an outer-loop subspace-selection scheme 
which picks consecutive blocks of n time slices to be 
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updated in the inner-loop using either a first-order or 
a second-order-method update scheme each allowing to 
take at most sumit steps within each block. The results 
of these explorations, as applied to the two-spin case of 
problem 2 and problem 21 (see Tab. |TV]and Sec. IIV A II 
with the same initial conditions as in Tab. |V|, are de- 
picted in Fig. [8] for first-order gradient update and in 
Fig. [H] for second-order BFGS update. They provide illu- 
minating guidelines for further investigation, as the Kro- 
tov method taking a single timeslice (n = 1) sequentially 
after the other for a single update step (siimit = 1) may 
not always be the best-performing use of the first-order 
update scheme. 

On the other hand, in second-order BFGS methods the 
GRAPE scheme with totally concurrent update cannot be 
accelerated by allowing for smaller blocks of concurrent 
update in the sense of a 'compromise towards Krotov'; 
rather it is an optimum within a broader array of simi- 
larly performing schemes. This is remarkable, while the 
incompatibility of BFGS with sequential update rules is 
to be expected on the grounds of the discussion above. 

Further explorative numerical results on first-order hy- 
brids between sequential and concurrent update as com- 
pared to the second-order concurrent update can be 
found as Fig.fTOlin the appendix. They show that in sim- 
pler problems first-order sequential update (as in Krotov) 
is faster than the (highly suboptimal) first- order variants 
of hybrid or concurrent update, while in more compli- 
cated problems already the first-order variant of concur- 
rent update is slightly faster. In any case, all first-order 
methods are finally outperformed by second-order con- 
current update (as in GRAPE-BFGS). 

Clearly, these explorative results are by no means the 
last word on the subject. Rather they are meant to invite 
further studies over a wider selection of problems. But 
even at this early stage we can state that there are hints 
that hybrid methods hold a yet untapped potential, and 
follow-up work is warranted. 



V. CONCLUSIONS AND OUTLOOK 

We have provided a unifiying modular programming 
framework, DYNAMO, for numerically addressing bilinear 
quantum control systems. It allows for benchmarking, 
comparing, and optimising numerical algorithms that 
constructively approximate optimal quantum controls. 
Drawing from the modular structure, we have compared 
the performance of gradient-based algorithms with se- 
quential update of the time slices in the control vector 
(Krotov-type) versus algorithms with concurrent update 
(GRAPE-type) with focus on synthesising unitary quan- 
tum gates with high fidelity. — For computing gradi- 
ents, exact methods using the eigendecomposition have 
on average proven superior to gradient approximations 
by finite differences, series expansions, or time averages. 

When it comes to implementing second-order schemes, 
the different construction of sequential update and con- 



(a) 




steps per block 

Figure 9: (Colour) Performance of generalising the second- 
order (bfgs) concurrent scheme to updating blocks of joint 
time slices and allowing for fewer iteration steps within each 
block (siimit), as applied to (a) problem 2 and (b) problem 21 
(see Tab. IIVI and Sec. IIV A"T|) . Original grape modifies all 
time slices in each iteration: this special case is shown in the 
lower right corner of the plot, while the upper left corner is 
the crude second-order variant of Krotov (for the sake of com- 
parison here in the unrecommendable setting of bfgs). It is 
part of the obvious no-go area of single iterations (s — 1) 
on a single time slice {n — 1), or just few, shown for com- 
pleteness. Wall times represent the average over 42 runs with 
random initial control vectors again with mean{uini) = and 
stdiuini) = 1 in units of 1/J; times are cut off at 35 (resp. 300) 
sec. 



current update translates into different performance: in 
contrast to the former, recursive concurrent updates 
match particularly well with quasi-Newton methods and 
their iterative approximation of the (inverse) Hessian as 
in standard BFGS implementations. Currently, however, 
there seems to be no standard Newton-type second-order 
routine that would match with sequential update in a 
computationally fast and efficient way such as to sig- 
nificantly outperform our implementation of first-order 
methods. Finding such a routine is rather an open re- 
search problem. At this stage, we have employed effi- 
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cient implementations, i.e. first-order gradient ascent for 
sequential update and a second-order concurrent update 
(grape-BFGS). As expected from second-order versus 
first-order methods, at higher fidelities (here typically 
90 — 99%), GRAPE-BFGS Overtakes Krotov. For reaching 
a fidelity in unitary gate synthesis of 1 — 10^**, GRAPE- 
BFGS is faster, in a number of instances even by more 
than one order of magnitude on average. Yet at lower 
qualities the computational speeds are not that different 
and sequential update typically has a (small) advantage. 

By its flexibility, the dynamo framework answers a 
range of needs, reaching from quantum information pro- 
cessing to coherent spectroscopy. For the primary fo- 
cus of this study, namely gate synthesis with high fideli- 
ties beyond the error-correction threshold of some lO"** 
(85| . fidelity requirements significantly differ from pulse- 
engineering for state-transfer, where often for the sake 
of robustness over a broad range of experimental pa- 
rameters, some fidelity (say 5%) may readily be sacri- 
ficed. Thus for optimising robustness, sequential update 
schemes are potentially advantageous, while for gate syn- 
thesis sequential methods can be a good start, but for 
reaching high fidelities, we recommend to change to con- 
current update. More precisely, since dynamo allows for 
efficient handover from one scheme to the other, this is 
our state-of-the-art recommendation. 

On-going and future comparisons are expected to profit 
from this framework, e.g., when trying update modules 
with non- linear conjugate gradients [88|, |89j. 
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update methods are most efficient when matched to 
first-order gradient procedures. — Yet, this issue is 
subject of follow-up work. 

Hybrid Algorithms: We have focused on the two 
extremes of the update scheme spectrum: the sequential 
and the fully concurrent. — Yet, hybrid schemes which 
intelligently select the subset of time slices to update 
at each iteration, and dynamically decide on the num- 
ber of steps and appropriate gradient-based stepping 
methodology for the inner loop may even achieve better 
results than the established two extremes. The success, 
however, depends on developing alternatives to BEGS 
matching with sequential update schemes (s.a.). 

Control Parametrisation Methods: We have looked 
exclusively at piece-wise-constant discretisation of the 
control function in the time domain. — Yet, although 
also frequency-domain methods exist (e.g. |34|). there is 
both ample space to develop further methods and need 
for comparative benchmarking. 

Algorithms for Super- Expensive Goal Functions: For 
many-body quantum systems, ascertaining the time- 
evolved state of the system requires extremely costly 
computational resources. — Yet, algorithms described in 
this manuscript all require some method of ascertaining 
the gradient, by finite differences if no other approach is 
available. Such requirements, however, are mal-adapted 
to super-expensive goal functions. Further research to 
discover new search algorithms excelling in such use cases 
is required. 



We have presented a first step towards establishing a 
"best of breed" toolset for quantum optimal control. It is 
meant to provide the platform for future improvements 
and follow-up studies, e.g., along the following lines: 

Further Types of Applications: We have focussed on 
the synthesis of high-fidelity unitary quantum gates 
in closed systems. — Yet, follow-up comparisons 
should extend to open systems or to spectroscopic state 
transfer, where it is to be anticipated that different 
demand of fidelity may lead to different algorithmic 
recommendations . 

Initial Conditions: Currently there is no systematic 
way how to choose good initial control vectors in a 
problem-adapted way. Scaling of initial conditions has 
been shown to translate into computational speeds 
differing significantly (i.e. up to an order of magnitude). 
— Yet, good guidelines for selecting initial controls are 
still sought for. 

Second-Order methods for sequential update: As has 
been mentioned, we have indications that sequential 
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interval [0, T] and on the control amplitudes u{t), a gen- 
eral quality function may be formulated to take the form 

/ - fT{X(T),T) +J fo{X(t),t,u(t)) dt , 



where fT{X{T),T^ is the component solely depending 
on the final state of the system X{T) and indepen- 
dent of the control amplitudes, while fo (^X{t),t, M(f)) col- 
lects the running costs usually depending on the ampli- 
tudes u{t). In optimal control and variational calculus, 
the general case (/t 7^ 0, /o 7^ 0) is known as prob- 
lem of BoLZA, while the special case of zero running 
costs (/t 7^ 0, fo = 0) is termed problem of Mayer, 
whereas (/t = 0, fo 7^ 0) defines the problem of La- 
grange. — Henceforth, we will not take into account 
any running costs (thereby also allowing for unbounded 
control amplitudes). Thus here all our problems take the 
form of Mayer. On the other hand, many applications of 
KROTOV-type algorithms have included explicit running 
costs [3^ . [33, l4^ . [soi l to solve problems of Bolza form, 
which are also amenable to grape (as has been shown 
in [HI). — Yet, though well known, it should be pointed 
out again that a problem of Bolza can always be trans- 
formed into a problem of Mayer, and ultimately all the 
three types of problems above are equivalent |90l ]. which 
can even be traced back to the pre-control era in the cal- 
culus of variations (9l] ]. The implications for convergence 
of the respective algorithms are treated in detail in [69l ]. 
195] In view of future optimisation, however, note that our 
parallelised CH — h version of grape already uses faster 
methods based on Chebychev polynomials as described 
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Table IV: Specification of Test Problems 



Problem 


Quantum System 


Matrix 


No. of 


Final Time 


Target 






Dimensions Time Slices 


[l/J] 


Gate 


1 


AB Ising-ZZ chain 


4 


30 


2 


CNOT 


2 


AB Ismg-ZZ cnam 


4 


40 


2 


CNOT 


3 


AB Ismg-ZZ cnam 


4 


128 


3 


CNOT 


4 


AB Ising-ZZ chain 


4 


64 


4 


CNOT 


5 


A T) / ' T • Tr? 1 

ABC Ismg-ZZ cham 


8 


120 


6 


QFT 


6 


ABC Ismg-ZZ cham 


8 


140 


7 


QFT 


7 


ABCD Ising-ZZ chain 


16 


128 


10 


QFT 


8 


AnCJJ ismg-ZZ chain 


16 


128 


12 


Ql 1 


9 


ABCD Ismg-ZZ cham 


16 


64 


20 


QFT 


10 


ABODE Ismg-ZZ cham 


32 


300 


15 


QFT 


11 


ABODE Ising-ZZ chain 


32 


300 


20 


QFT 


12 


ABODE Tqinfr-77, rhain 


32 


64 


25 


OFT 


13 


Ca Graph-ZZ 


16 


128 


7 


Vcs 


1 4 


n. PrnnVi 77 
vjiapii-ZjZj 


ID 


iZO 


1 9 


ucs 


1 


IN V -CcliLl C 


/I 


/in 


9 


r'NOT 


16 


NV-centre 


4 


64 


5 


CNOT 


17 


AAAAA Ising-ZZ chain 


32 


1000 


125 


QFT 


18 


AAAAA Ising-ZZ chain 


32 


1000 


150 


QFT 


19 


AAAAA Heisenberg-XXX chain 


32 


300 


30 


QFT 


20 


AOO Heisenberg-XXX chain 


8 


64 


15 


rand U 


21 


ABOO Heisenberg-XXX chain 


16 


128 


40 


rand U 


22 


driven spin- 6 system 


13 


100 


15 


rand U 


23 


driven spin-3 system 


7 


50 


5 


rand U 



A notation ABC means the spin chain consists of three spins that are addressable each by an 
individual set of x- and j/-controls. We write ylO for a locally controllable spin A which is coupled 
to a neighbour not accessible by any control field. 



itia 

1 

2 

3 

4 

5 

6 

7 

8 

9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
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!sults obtained from 20 unconstrained optimisations (f minunc in matlab) for each problem of Tab. IIVI using 
■ the concurrent-update algorithm. Small initial pulse amplitudes were used (mean(iti„i) = 0, std(tii,ii) = 1). 

Algorithm Final Fidelity Wall Time [min] #Eigendecs/1000 #Matrix Mults/1000 

(mean/min/max) (mean/min/max) (mean/min/max) (mean/min/max) 



seq. 



0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 



0.02/0.01/0.03 
0.19/0.13/0.34 



2.02/1.35/2.94 
6.29/4.36/11.68 



38/25/56 
88/61/163 



cone, 
seq. 



seq. 



seq. 



0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 

0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 

0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 



0.05/0.03/0.08 
0.16/0.11/0.27 

0.05/0.04/0.08 
0.07/0.05/0.12 

0.02/0.01/0.02 
0.05/0.03/0.11 



2.68/1.76/4.44 
5.43/3.80/9.04 

4.61/3.46/7.04 
2.29/1.56/4.12 

1.70/1.28/2.43 
1.72/1.08/3.84 



50/33/84 
76/53/126 

85/63/132 
32/22/57 

31/23/45 
24/15/54 



seq. 



0.9978/0.9973/0.9990 
0.9973/0.9918/0.9986 



7.19/5.84/7.86 

34/22/58 



362/310/367 
1976/1320/3292 



9774/8364/9917 
35542/23729/59209 



seq. 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



0.85/0.34/2.21 
5.14/1.14/18.72 



35/17/76 
310/68/1143 



954/450/2050 
5574/1216/20554 



cone, 
seq. 



cone, 
seq. 



0.9970/0.9886/0.9999 9.42/2.32/18.48 
0.9945/0.9825/0.9999 242/50/491 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



2.90/0.83/10.39 
16.11/2.36/65.72 



229/63/391 
3975/1242/7753 

72/21/275 
500/89/2223 



8028/2210/13679 
87385/27313/170455 

2530/735/9627 
11002/1953/48876 



cone. 0.9999/0.9999/0.9999 
seq. 0.9999/0.9999/0.9999 



0.61/0.33/0.92 
4.83/1.91/7.81 



20/11/30 
161/63/259 



685/381/1052 
3536/1375/5696 



cone. 0.9982/0.9740/0.9999 376/12/918 435/82/917 18694/3510/39442 

seq. 0.9959/0.9661/0.9999 2591/244/8458 13123/1312/40136 341107/34116/1043279 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9991/0.9999 
0.9998/0.9988/0.9999 



148/11/1236 
786/72/4817 



0.9996/0.9974/0.9999 62.22/4.48/286.60 
0.9994/0.9956/0.9999 284/56/1842 



189/71/919 
4041/427/17767 

89/22/192 
987/245/4563 



8114/3045/39519 
105031/11097/461821 

3818/942/8276 
25637/6360/118491 



cone, 
seq. 



cone, 
seq. 



0.9989/0.9936/0.9999 5.20/1.52/14.89 
0.9759/0.9373/0.9999 129.00/6.39/439.92 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



1.45/0.70/2.62 
6.47/1.76/16.06 



138/41/390 
4174/215/15103 

35/19/60 
219/59/547 



4833/1434/13634 
91773/4719/332029 

1235/677/2089 
4813/1292/12033 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/1.0000 
0.9999/0.9999/1.0000 

0.9999/0.9999/1.0000 
0.9999/0.9999/1.0000 



0.01/0.00/0.01 
0.02/0.01/0.04 

0.01/0.00/0.01 
0.01/0.01/0.02 



0.90/0.64/1.24 
1.76/0.72/3.28 

0.80/0.70/1.28 
0.67/0.51/1.54 



9.57/6.67/13.30 
17.53/7.16/32.64 

8.19/7.13/13.48 
6.64/5.10/15.31 



cone. 0.9999/0.9999/0.9999 
seq. 0.9999/0.9999/0.9999 



160/27/357 684/616/773 7516/6767/8495 

2582/1411/4638 16577/11490/27082 165733/114877/270766 



cone. 0.9999/0.9999/0.9999 88/13/220 394/286/620 
seq. 0.9999/0.9999/0.9999 492/247/1520 2954/2434/3985 



4330/3137/6811 
29535/24335/39842 



cone. 0.9999/0.9999/0.9999 45.85/8.49/213.95 170/103/264 
seq. 0.9999/0.9999/0.9999 128/76/217 1124/809/1490 



3896/2354/6060 
17978/12945/23822 



seq. 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



0.06/0.04/0.08 
0.45/0.21/1.02 



6.92/4.80/9.47 
26/15/43 



76/52/104 
258/148/431 



cone. 0.9999/0.9999/0.9999 
seq. 0.9999/0.9999/0.9999 



0.18/0.16/0.20 
1.26/0.82/2.76 



8.56/7.81/9.60 
39/29/57 



161/146/180 
551/410/804 



cone. 0.9999/0.9999/0.9999 0.96/0.47/2.02 68/42/105 750/459/1154 

seq."* 0.9998/0.9994/0.9999 407/112/732 21692/6473/30000 216483/64599/299399 



cone. 0.9999/0.9999/0.9999 0.60/0.24/1.64 53/25/141 
seq. 0.9951/0.9797/0.9995 39.03/9.39/111.74 2992/744/7163 



588/279/1559 
29796/7408/71343 



the stopping conditions were changed for Fig.l31d1. so the data are no longer comparable to Tabs. IVII and lVlTr 



1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
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; obtained from 20 unconstrained optimisations (fminunc in matlab) for each problem of Tab. II Vl usine 
mt-update. Higher initial pulse amplitudes (mean(iti„i) — 0, std{uini) = 10) than in Tab. FVl were used. 

Algorithm Final Fidelity Wall Time [min] #Eigendecs/1000 #Matrix Mults/1000 

(mean/min/max) (mean/min/max) (mean/min/max) (mean/min/max) 



cone, 
seq. 



eone. 
seq. 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/0.9999 
0.9999/0.9998/0.9999 

0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 

0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 

0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 



0.04/0.02/0.06 
1.54/0.43/3.78 

0.05/0.02/0.08 
1.38/0.42/3.28 

0.07/0.05/0.10 
0.22/0.12/0.49 

0.02/0.01/0.03 
0.18/0.05/0.54 



4.25/2.61/6.60 
118/33/289 

5.39/2.76/8.56 
109/33/261 

6.06/4.35/8.45 
17.29/9.73/38.53 

2.50/1.60/3.39 
13.79/4.22/42.18 



80/49/125 
1645/459/4023 

102/52/162 
1520/464/3635 

113/80/158 
242/136/539 

46/29/63 
193/59/589 



cone, 
seq. 



cone, 
seq. 



0.9976/0.9959/0.9986 
0.9969/0.9952/0.9983 

0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



6.97/6.24/7.32 
73/51/105 

2.23/1.36/3.82 
16/11/32 



364/362/370 
4246/2954/6021 

105/60/180 
935/614/1866 



9839/9784/9995 
76349/53121/108271 

2842/1623/4860 
16823/11049/33567 



cone, 
seq. 



cone, 
seq. 



cone, 
seq. 



0.9893/0.9366/0.9999 
0.9928/0.9444/0.9993 



15/14/16 
325/129/730 



0.9998/0.9984/0.9999 9.86/4.27/15.02 
0.9990/0.9851/0.9999 158/43/645 



0.9999/0.9999/0.9999 
0.9996/0.9995/0.9998 



2.81/1.80/4.62 
41/22/104 



386/385/387 
8053/4190/16630 

257/110/386 
3511/1400/13269 

87/57/142 
1057/693/2033 



13499/13469/13563 
177047/92125/365595 

9000/3832/13495 
77189/30788/291716 

3056/2007/4982 
23223/15223/44653 



cone, 
seq. 



cone, 
seq. 



cone, 
seq. 



0.9978/0.9834/0.9999 638/91/1566 

0.9995/0.9974/0.9999 3213/529/8282 

0.9999/0.9998/0.9999 449/37/1165 

0.9999/0.9998/0.9999 1557/863/2935 



0.9984/0.9948/0.9999 
0.9974/0.9911/0.9990 



197/16/416 
883/255/2060 



798/402/944 
16045/6914/33844 

613/335/906 
8408/4895/14865 

196/192/202 
4320/1994/9522 



34315/17276/40590 
417076/179721/879708 

26342/14386/38939 
218564/127232/386383 

8426/8273/8678 
112192/51780/247266 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/0.9999 2.65/1.74/4.43 
0.9999/0.9999/0.9999 16.31/9.63/31.65 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



1.48/1.09/1.94 
5.25/3.90/6.50 



64/47/107 
520/310/1040 

40/33/48 
166/126/207 



2247/1636/3729 
11427/6804/22872 

1405/1152/1676 
3654/2772/4550 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/1.0000 
0.9999/0.9999/1.0000 

0.9999/0.9999/1.0000 
0.9999/0.9999/1.0000 



0.00/0.00/0.01 
0.01/0.01/0.02 

0.00/0.00/0.01 
0.01/0.01/0.02 



0.55/0.40/0.76 
0.75/0.52/1.60 

0.76/0.58/1.22 
0.58/0.45/1.15 



5.70/4.02/8.00 
7.44/5.17/15.92 

7.73/5.71/12.77 
5.77/4.47/11.48 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 

0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



162/28/320 
1346/763/2603 

118/24/309 
897/428/1152 



616/536/750 
9238/7502/13230 

522/400/652 
6788/5799/8207 



6768/5887/8242 
92361/75005/132274 

5736/4391/7163 
67863/57978/82054 



cone. 0.9999/0.9999/0.9999 41.77/5.48/120.52 65/58/71 1481/1332/1636 
seq. 0.9999/0.9999/0.9999 59/32/89 460/398/547 7354/6362/8747 



cone. 0.9998/0.9991/0.9999 
seq. 0.9980/0.9879/0.9996 



1.91/0.51/6.98 
64/29/103 



139/47/202 
3508/2098/6647 



1529/517/2225 
34972/20910/66265 



cone. 0.9999/0.9999/0.9999 
seq. 0.9998/0.9998/0.9999 



1.50/1.18/2.07 
118/84/164 



70/53/96 
4269/2860/5791 



1328/1005/1818 
59697/39992/80980 



cone. 0.9999/0.9999/0.9999 
seq. 0.9995/0.9982/0.9998 



0.58/0.32/0.90 

81/35/137 



51/29/74 
4128/1981/6114 



563/317/820 
41194/19771/61017 



cone. 0.9999/0.9999/0.9999 
seq. 0.9999/0.9999/0.9999 



O.O6/0.05/0.07 
1.20/0.59/2.24 



7.58/6.20/9.85 
93/46/171 



83/68/108 
925/458/1702 
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Its obtained from 20 constrained optimisations (fmincon in matlab) for each problem of Tab. II VI using the 
icurrent-update algorithm. Small initial pulse amplitudes were used (mean(iti„i) = 0, std(iti„i) — 1). 

Algorithm Final Fidelity Wall Time [min] #Eigendecs/1000 #Matrix Mults/1000 

(mean/min/max) (mean/min/max) (mean/min/max) (mean/min/max) 



cone, 
seq. 



eone. 
seq. 



eonc. 
seq. 



cone, 
seq. 



0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 

0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 

0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 

0.9999/0.9999/1.0000 
0.9999/0.9999/0.9999 



0.11/0.02/0.26 
0.10/0.04/0.22 

0.05/0.03/0.10 
0.08/0.05/0.09 

0.09/0.08/0.13 
0.08/0.03/0.15 

0.03/0.02/0.04 
0.03/0.01/0.04 



1.43/1.23/1.68 
6.17/2.70/10.92 

2.10/1.64/2.52 
5.74/4.04/6.96 

4.49/3.71/5.50 
5.13/2.05/8.06 

1.60/1.34/1.98 
1.89/1.02/2.75 



27/23/31 
86/38/152 

39/31/47 
80/56/97 

83/68/102 
72/29/113 

29/24/37 
26/14/38 



cone, 
seq. 



cone, 
seq. 



0.9877/0.9322/0.9990 
0.9973/0.9918/0.9986 

0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



44/24/67 
34/22/61 

1.87/0.73/3.61 
5.34/1.15/19.60 



364/361/368 
1976/1320/3292 

24/18/40 
310/68/1143 



9828/9759/9947 
35542/23729/59209 

650/473/1074 
5574/1216/20554 



cone. 0.9958/0.9808/0.9999 

seq. 0.9945/0.9825/0.9999 

cone. 0.9999/0.9999/0.9999 

seq. 0.9999/0.9999/0.9999 

cone. 0.9999/0.9999/0.9999 

seq. 0.9999/0.9999/0.9999 



29.82/5.74/69.99 
123/39/244 

5.39/2.08/20.83 
15.19/2.66/67.58 

1.21/0.69/1.68 
4.90/1.91/7.99 



244/69/390 
3978/1242/7749 

49/29/198 
500/89/2223 

14.94/8.19/21.06 
161/63/259 



8552/2411/13634 
87443/27313/170360 

1697/995/6925 
11002/1953/48876 

521/285/735 
3536/1375/5696 



cone, 
seq. 



cone, 
seq. 



cone, 
seq. 



0.9998/0.9985/0.9999 281/21/1753 
0.9991/0.9864/0.9999 1942/122/12208 



355/94/904 15251/4052/38874 
11549/988/84232 300198/25679/2189468 



0.9999/0.9999/0.9999 153.76/7.80/1104.29 144/68/566 6182/2890/24346 
0.9998/0.9988/0.9999 1141/86/9848 6990/427/74922 181706/11097/1947465 



0.9999/0.9992/0.9999 76.27/5.76/948.99 70/22/194 
0.9997/0.9970/0.9999 1108/39/5.566 5054/245/19200 



3017/936/8331 
131246/6360/498598 



cone, 
seq. 



cone, 
seq. 



0.9844/0.9102/0.9999 13.91/2.49/75.62 
0.9759/0.9373/0.9999 128.25/6.64/454.86 



0.9973/0.9867/0.9999 
0.9999/0.9999/0.9999 



7.47/2.06/14.20 
6.58/1.77/16.61 



120/27/398 
4174/215/15103 

49/29/80 
219/59/547 



4189/950/13926 
91773/4719/332029 

1720/1000/2801 
4813/1292/12033 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/1.0000 
0.9999/0.9999/1.0000 

0.9999/0.9999/1.0000 
0.9999/0.9999/1.0000 



0.05/0.02/0.10 
0.02/0.01/0.04 

0.04/0.01/0.07 
0.01/0.01/0.02 



0.71/0.52/0.92 
1.76/0.72/3.28 

0.96/0.77/1.34 
0.67/0.51/1.54 



7.49/5.35/9.77 
17.53/7.16/32.64 

9.99/7.83/14.19 
6.64/5.10/15.31 



cone, 
seq. 



cone, 
seq. 



0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 

0.9999/0.9999/0.9999 
0.9999/0.9999/0.9999 



531/58/1443 
2284/1054/3898 

157/26/754 
386/175/690 



1224/1032/1551 13454/11344/17054 
16774/11490/27294 167710/114877/272885 



574/530/655 
2953/2434/3985 



6300/5821/7196 
29524/24335/39842 



cone. 0.9999/0.9999/0.9999 105/16/335 166/141/186 
seq. 0.9999/0.9999/0.9999 143/64/328 1130/996/1465 



3807/3244/4273 
18064/15925/23433 



cone. 0.9999/0.9999/0.9999 
seq. 0.9999/0.9999/0.9999 



0.53/0.12/1.14 
0.45/0.23/0.77 



5.15/4.16/6.91 

30/16/51 



56/45/76 
302/160/511 



cone. 0.9999/0.9999/0.9999 
seq. 0.9999/0.9999/0.9999 



0.49/0.36/0.89 
1.39/0.79/3.80 



9.53/9.09/10.37 
39/29/57 



179/171/195 
551/410/804 



cone. 0.9999/0.9999/0.9999 
seq. 0.9991/0.9983/0.9995 



5.34/2.39/8.03 
108/59/386 



131/93/193 
4702/3317/6780 



1444/1026/2128 
46924/33106/67669 



cone. 0.9999/0.9999/0.9999 2.26/0.42/9.57 38/13/83 420/148/913 

seq. 0.9951/0.9797/0.9995 37.38/9.40/97.52 2991/744/7163 29786/7408/71343 
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VI. APPENDIX 
A. Exact Gradients (Eqn. 

For deriving the gradient expression in Eqn. ([24| , we fol- 
low [zO,[Zl|- Note that by 



dX d 

— = — exp{-^A^(i^d + (uj + u)Hj + V u„H^)} 



9uj du 

d 



= — exp{-^A^(i^„ + uiJj)} 



(60) 



one may invoke the spectral theorem in a standard way 
and calculate matrix functions via the eigendecoposition. 
For an arbitrary pair of Hermitian (non-commuting) ma- 
trices A,B and a; G M , take {[A^)} as the orthonormal 
eigenvectors to the eigenvalues {Ay} of A to obtain the 
following straightforward yet lengthy series of identities 



x=0 



Am) 

n=0 

(AH £ ^ + ^By-'B{A + xBy-'\\„,) 

oo _j n 
oo ^ n 

oo ^ 71 



x=0 



11=0 q=l 



(61) 



already explaining the case A; = A„i, while for A; ^ X^, 
we have 



oo _j n 

D^{Xi\B\X^}J2-X^^''Yl 

oo -J 

= (AHS|A™)^-A 



1 (A(/Am)' 



9-1 



n=0 
oo 



{Xl/Xm) - 1 



(62) 



1 \n \n 

(AHS|Am)E^^^ 

^ n! A; ~ Am 

n— 

(A/|S|A, 



A/ — A,i 



An analogous result holds for skew-Hermitian iA, iB. So 
substituting A ^ —iAtHu and xB H> —iAt uHj as well 
as Aj, I— 7- — lAiAi, for ly ^ l,m while keeping the eigenvec- 
tors |Ai^) readily recovers Eqn. Note that we have 
explicitly made use of the orthogonality of eigenvectors 



to different eigenvalues in Hermitian (or more generally 
normal) matrices A, B. Hence in generic open quantum 
systems with a non-normal Lindbladian, there is no such 
simple extension for calculating exact gradients. 



B. Standard Settings in a Nutshell 

For convenience, here we give the details for six stan- 
dard tasks of optimising state transfer or gate synthesis. 
The individual steps give the key elements of the core 
algorithm in Sec. IH Dl and its representation as a flow 
chart. 

Task 1: Approximate Unitary Target Gate up to Global 
Phase in Closed Systems 

Define boundary conditions Xq 1, Xm+i ■= t^argot; 
fix final time T and digitisation A/ so that T = M At. 

(a) set initial control amplitudes u^^\tk) CM. for 
aU times tk with k € T^^^ := {1,2,..., M}; 

(b) exponentiate Uk ^ e-"^*"^^"^ for all k e T^'') with 
Hk ■^Hd + EjUj{tk)H, ; 

(c) calculate forward-propagation 

Uk:0 '■= UkUk-l ■ ■ ■ UiUq 

(d) calculate back-propagation 

(e) evaluate fidelity / = where 

9 '■= J^'^"^{^U+\:k+\^k:'d) = {^tar^jl/:o} 

and stop if / > 1 - etiucshoid or iteration r > nimit- 

(f) evaluate gradients for all k G T''^-' 

-j^Ketr|e ^'^ I^M+i:k+i{-d^)'^k~i:a) 

with 1^ of Eqn. ^ and e-*'^9 g* /\g\; 

(g) update amplitudes for all k € T'-'"-' by quasi-Newton 
„f+i)(t,) = u'^pitk) + akUl' or other 
methods (as in the text); 

(h) while > /j'jnjjt for some k G T^''^ re-iterate; 
else re-iterate with new set T"*-''"''^-'. 

Comments: Algorithmic scheme for synthesising a uni- 
tary gate U{T) such as to optimise the gate fidelity 

/ |-^ tr{[//jjj.getC^(2^)}|- This setting automatically ab- 
sorbs global phase factors as immaterial: tracking for 
minimal times T^, to realise f/target up to an undeter- 
mined phase automatically gives a e"^*C/targot G SU(N) 
of fastest realisation. 
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Task 2: Approximate Unitary Target Gate Sensitive to 
Global Phase in Closed Systems 

Fix boundary conditions Xq := 1, Xm+i e"^J7targot by 
choosing global phase to ensure det(e"^C/targot) = +1 so 
that e^-^t/targot e SU(N); there are N such choices [l3|; 
fix final time T and digitisation M so T = MAt. 

(a) through (d) as in Task 1. 

(e) evaluate fidelity 

and stop if / > 1 - etiucshoid or iteration r > nimit- 

(f) evaluate gradients for all k e T^*^-* 

MH£id)=^Retr{AU,,^,(fi)t/.._,o} with 
^ ofEqn. (El; 

(g) and (h) as in Task 1. 

Comments: Algorithmic scheme for synthesising a uni- 
tary gate U{T) in closed quantum systems such as to op- 
timise the gate fidelity / ■^RctT{e''■'f'U|^^.g^^U{T)}. 
This setting is sensitive to global phases (f> that have to 
be specified in advance. Warning: whenever drift and 
control Hamiltonians operate on different time scales, the 
minimal time required to realise e"^C/targct S SU(N) 
will typically and significantly depend on (jj as demon- 
strated in jl2l |. a problem eliminated by Task 1. 

Task 3: Optimise State Transfer between Pure-State 
Vectors 

Define boundary conditions Xq :~ |V'o)i Xm+i ■= 
IV') target! fix final time T and digitisation M so that 
T = MAt. 

(a) set initial control amplitudes u^°''(tfe) £ Z-/ C M 

(b) exponentiate Uk = e-'^*"^*"'' for all k e T^''' with 

(c) calculate forward-propagation 

\Mtk)) ■.= UkUk-i---uM 

(d) calculate back-propagation 

(■0tar(ifc)| := {'iptia]UMUM-l ' ' ' Uk+1 

(e) evaluate fidelity /, where 
f ■.= Re{AM\Mtk)) 

= Re (V'tar|([/A/---f/fc---!7i|^o)) 
and stop if / > 1 - etiucshoid or iteration r > nimit- 

(f) evaluate gradients for all k £ T^*^-* 

df{U{tk)) ^ 

Uj 

= Re (V-tarKC/M • • • Uk+iC^^pk-i ■■■Ui IV^o)) 
again with of Eqn. 



(g) update amplitudes for all k S T^''-' by quasi-Newton 
4'-+i)(t,) = u^pitu) + auU-,' '-1^^ or other 
methods (as in the text); 

(h) while > /liujit for some k g T'-'"' re-iterate; 
else re-iterate with new set 'J'^'^+^\ 

Comments: Algorithmic scheme for optimising (pure) 
state-to-state transfer in closed quantum systems. This 
setting is sensitive to global phases (j) in e''^|-!/') that have 
to be specified in advance. 

Task 4: Optimise State Transfer between Density Oper- 
ators in Closed Systems 

Define boundary conditions Xq '.= po, A^iz+i := Ptargot; 
fix final time T and digitisation M so that T ^ M At. 

(a) and (b) as in Tasks 1 through 3. 

(c) calculate forward-propagation 

Po{tk) UkUk-i ■ ■ ■ UipoUl ■ ■ ■ Ul_,ul 

(d) calculate back-propagation 

pLritk) ■■= C/^+i ■ ■ ■ uIj_^uIipI^,UmUm-i ■ ■ ■ Uk+1 

(e) evaluate fidelity / with normalisation c := Uptai Hi 
/:=iRetr{pLr(ife)Po(^fc)} 

and stop if / > 1 - etiucshoid or iteration r > nimit- 

(f) evaluate gradients for all k G T''^-' 

^« = ^.Re{tr{pUtkK^)Mtk-i)Ul} + 

tr{pUtk)UkPo{h^i)C^)}) with 1^ of 
Eqn. (El; 

(g) and (h) as in Tasks 1 through 3; 

Comments: Algorithmic scheme for optimising state- 
to-state transfer of density operators in closed quantum 
systems. 

Task 5: Approximate Unitary Target Gate by Quantum 
Map in Open Markovian Systems 

Define boundary conditions Xq := 1, Xm+i ■= t^argct; 
fix final time T and digitisation AI so that T = M At. 

(a) set initial control amplitudes (tk) £ U CM. for 
aU times tk with k e T*") := {1,2,..., M}; 

(b) exponentiate Xk = e-*'^*^(*'=)+^ for all k e T^''^ 
with Hk := Ho + J2j Uj{tk)Hj ; 

(c) calculate forward-propagation 
Xk:0 XkXk^i ■ ■ ■ Xi{Xq = 1); 

(d) calculate back-propagation 
M/+i:fc+i ■= uI^.^XmXm~i ■ ■ ■ Xk+1 
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(e) evaluate fidelity 

Retr {A\j+i.,k+iXk:o} = W^ ^etr {ul.X^j.^} 
and stop if / > 1 - ethrcshoid or iteration r > rumit. 

« ^Retr{AU,,^,(*ff.^ + £-)X,..o} 

(g) update amplitudes for all k G T^'"' by quasi-Newton 
u<f+'\t,) = uf\t,) + akUZ' or other 
methods (as in the text); 

(h) while llf^^ll > /I'iniit for some k G T'^'') re-iterate; 
else re-iterate with new set 'T^'^+^\ 



(a) and (b) as in Task 5. 

(c) calculate forward-propagation 
Xk:0 ■= XkXk-i ■ • -Xi vcc(po); 

(d) calculate back-propagation 

^M+i:fc+i ■= ^cc\p\^^)XmXm-\ ■ ■ ■ Xk+i 

(e) evaluate fidelity 

and stop if / > 1 - ethrcshoid or iteration r > rumit- 



Comments: General algorithmic scheme for synthesis- 
ing quantum maps X{T) at fixed time T with optimised 

the gate fidelity f :— Rctr{J7targct-'^(7')} in open dis- 
sipative quantum systems. X(t) denote Markovian quan- 
tum maps generated in step 1. 



(f) approximate gradients for all k e T^'"-' 

« ^Retr{AU,,^,(^i^„^ + £-)X,..,] 

(g) and (h) as in Task 5. 



Task 6: Optimise State Transfer between Density Oper- 
ators in Open Systems 

Define boundary conditions by vectors in Liouville space 

Xq := vec(po) and xl^^ := vec*(pJargot)j final time T 
and digitisation M so that T = MAt. 



Comments: Algorithmic scheme for optimising state 
transfer between density operators in open Markovian 
quantum systems, where the representation in Liouville 
space is required. So Task 6 can be seen as the rank-1 
version of Task 5. 
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C. Further Numerical Results 





wall time [min] wall time [min] 



Figure 10: (Colour) Performance of a broader variety of first-order schemes compared to the l-bfgs concurrent update. The 
red traces show the plain Krotov sequential first-order update, while the first-order concurrent update is given in magenta and a 
first-order hybrid (with block size of 5) is given in green. For comparison, the second- order l-bfgs concurrent update is shown 
in blue. The test examples are again taken from (see Tab. II VI and Sec. IIV A ip with mean{ui„i) — and std(uini) = 1 in units 
of 1/J. Note that in the (simpler) problems 1, 3, and 4 original Krotov performs fastest among all the first-order methods, 
while in problems 15, 16, and 20 the variance within each method comes much closer to the variance among the methods so 
that in problem 16 the first-order concurrent scheme outperforms the sequential one. 



